Crayons: A Cloud Based Parallel Framework for GIS Overlay Operations
Abstract
Overlay processing of vector-based GIS spatial data is much more complex than raster data processing. GIS data files can be huge, and their overlay processing is computationally intensive. Little work has been done on processing large volumes of vector geospatial data through parallel or distributed computing, and none on cloud platforms. We have created the Crayons system, which we believe to be the first such parallel framework over clouds for overlay analysis of two GIS layers of polygonal data in GML format. The Windows Azure cloud platform was a challenge because it currently lacks support for traditional distributed computing infrastructures such as MPI or MapReduce. This paper presents the basic design of the Crayons framework and explores the amount of parallelism in GIS computation over Azure. We show how the computation underlying this application can be effectively partitioned into independent tasks, and how Azure communication and storage mechanisms can be used to distribute these tasks among processors (Azure workers). We report on how much scalability the Azure platform affords to the various computational and I/O phases, and point out bottlenecks in both the algorithms and the Azure platform. Our experimental results show excellent speedups for the basic overlay computation, highlight the possible need for a new, distributed representation and storage of GIS files, and promise further scalability over larger clouds and data files.
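The partitioning idea in the abstract (independent tasks handed to workers through a queue) can be illustrated with a minimal sketch. This is not the Crayons implementation: it assumes axis-aligned bounding boxes in place of real GML polygons, an in-process `queue.Queue` in place of Azure queue storage, and threads in place of Azure workers. The names `Box`, `intersect`, `make_tasks`, and `run_overlay` are hypothetical and chosen only for illustration.

```python
import queue
import threading
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned bounding box, used here as a stand-in for a polygon (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping region of a and b, or None if they do not overlap."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin < xmax and ymin < ymax:
        return Box(xmin, ymin, xmax, ymax)
    return None


def make_tasks(base_layer, overlay_layer):
    """Build one independent task per base shape that has overlapping overlay shapes."""
    tasks = []
    for base in base_layer:
        candidates = [o for o in overlay_layer if intersect(base, o) is not None]
        if candidates:
            tasks.append((base, candidates))
    return tasks


def run_overlay(base_layer, overlay_layer, num_workers=4):
    """Distribute tasks to worker threads through a shared queue and collect results."""
    task_queue = queue.Queue()
    for task in make_tasks(base_layer, overlay_layer):
        task_queue.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        # Each worker pulls tasks until the queue is empty; tasks share no state.
        while True:
            try:
                base, candidates = task_queue.get_nowait()
            except queue.Empty:
                return
            local = [intersect(base, o) for o in candidates]
            with lock:
                results.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    base = [Box(0, 0, 2, 2), Box(5, 5, 6, 6)]
    overlay = [Box(1, 1, 3, 3), Box(10, 10, 11, 11)]
    print(run_overlay(base, overlay))
```

Because each task carries everything it needs, workers never coordinate with one another, which is the property that lets such a computation scale with the number of workers. A real overlay would replace `intersect` with a polygon clipping routine.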
Similar resources
Cloud Computing for Fundamental Spatial Operations on Polygonal GIS Data
Efficient end-to-end parallel/distributed processing of polygon-based spatial data (also known as vector-based data) has been a long-standing research question in the GIS community. The irregular and data-intensive nature of the underlying computation has impeded exploratory research in this space. We have created an open architecture based system named Crayons for the Azure cloud platform using st...
MPI-GIS: High Performance Computing and IO for Spatial Overlay and Join
Geo-spatial datasets are large and the related computations and analytics are computationally intensive. For certain GIS and Spatial Database applications, spatial overlay and join on two or more layers of geo-spatial data may be necessary. However, using a sequential paradigm to process them is time-consuming. For instance, it takes roughly 20 hours to compute the spatial join of a polyline table...
Beyond Mapping III
Use a Map-ematical Framework for GIS Modeling — describes a conceptual structure for map analysis operations and GIS modeling Getting the Numbers Right — describes an alternative framework based on how the map values are retrieved to classify analytical operations. Options Seem Endless When Reclassifying Maps — discusses the basic reclassifying map operations Overlay Operations Feature a Variet...
Scientific High Performance Computing (HPC) Applications On The Azure Cloud Platform
Cloud computing is emerging as a promising platform for compute and data intensive scientific applications. Thanks to the on-demand elastic provisioning capabilities, cloud computing has instigated curiosity among researchers from a wide range of disciplines. However, even though many vendors have rolled out their commercial cloud infrastructures, the service offerings are usually only best-eff...
Crayons: Empowering CyberGIS by Employing Cloud Infrastructure
Researchers in geographic information systems and science (GIS) have perceived large-scale vector-data computation as a challenge due to data intensity [1, 2]. When large volumes of data are deployed for spatial analysis and overlay computation, it is a time-consuming task, which in many cases is also time-sensitive. Since the creation of the National Spatial Data Infrastructure [3] by 1994 Pres...